Objective of the Project:

Build a classifier to predict the Pass/Fail yield of a particular process entity, and analyse whether all the features are required to build the model.

Importing required libraries.

The 5-point summary is defined as follows:

  1. The minimum.
  2. Q1 (the first quartile, or the 25% mark).
  3. The median (50%).
  4. Q3 (the third quartile, or the 75% mark).
  5. The maximum.
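As a minimal sketch of the five values listed above, the helper below computes them with the standard library only. The function name `five_point_summary` and the sample values are illustrative assumptions, not taken from the notebook. Quartile conventions differ between tools; `method="inclusive"` is used here because it interpolates linearly between data points.

```python
import statistics


def five_point_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a sequence of numbers."""
    data = sorted(values)
    # n=4 yields the three quartile cut points; "inclusive" treats the data
    # as the whole population and interpolates linearly between points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Illustrative values only (not from the project data).
summary = five_point_summary([2, 4, 4, 5, 7, 9, 10, 12])
print(summary)  # (2, 4.0, 6.0, 9.25, 12)
```

For a pandas DataFrame, `DataFrame.describe()` reports the same five values per column, alongside the count, mean and standard deviation.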

Univariate Analysis

Model Building

Balancing Target column
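The notebook does not state which balancing technique it uses, so the following is only one possible approach: random oversampling of the minority class, sketched with the standard library. The function name `oversample_minority`, the `"Pass"`/`"Fail"` label strings and the toy rows are assumptions made for illustration.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # All original rows that carry this label.
        pool = [row for row, y in zip(rows, labels) if y == label]
        # Draw with replacement to fill the gap to the majority class size.
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_rows.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_rows, out_labels


# Toy data: 8 Pass rows and 2 Fail rows (illustrative only).
rows = [[i] for i in range(10)]
labels = ["Pass"] * 8 + ["Fail"] * 2
balanced_rows, balanced_labels = oversample_minority(rows, labels)
print(Counter(balanced_labels))
```

Resampling is usually applied to the training split only, so that duplicated rows do not leak into the test set.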

Training Model

Checking the statistical characteristics of train and test data with original data.

Checking the train and test accuracies achieved with different sample populations.

Using other models with tuning

  1. KNeighborsClassifier
  2. SVM
  3. DecisionTreeClassifier
  4. RandomForestClassifier
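As a hedged illustration of what "tuning" could look like for the first model in the list, the sketch below runs a small grid search with scikit-learn's `GridSearchCV`. The synthetic data from `make_classification`, the parameter grid and the split sizes are assumptions, not the values used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the project data (assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hypothetical grid; the notebook's actual search space is not given.
param_grid = {"n_neighbors": [3, 5, 7], "weights": ["uniform", "distance"]}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print(search.best_params_)
# score() uses the best estimator, refitted on the full training split.
test_accuracy = search.score(X_test, y_test)
print(test_accuracy)
```

The same pattern applies to the other three models by swapping the estimator and its parameter grid.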

1. KNeighborsClassifier

1. KNeighborsClassifier Tuning

2. SVM

2. SVM Tuning

3. DecisionTree Classifier

3. DecisionTree Classifier Tuning

4. RandomForest Classifier

4. RandomForest Classifier Tuning

Using models with CrossValidation
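A minimal sketch of comparing the four models under cross-validation, assuming stratified 5-fold splits and default hyperparameters. The synthetic data and the fold count are assumptions; the notebook's own settings are not shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the project data (assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

models = {
    "KNeighborsClassifier": KNeighborsClassifier(),
    "SVM": SVC(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
}

# Stratified folds keep the class ratio similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in models.items()
}

for name, scores in cv_scores.items():
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```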

Using models with PCA
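One way to probe whether all features are needed is to place PCA in front of a classifier and see how many components survive. The sketch below assumes standardised inputs, a 95% explained-variance threshold and synthetic data; none of these values come from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the project data (assumption).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)

pipeline = make_pipeline(
    StandardScaler(),        # PCA is sensitive to feature scale
    PCA(n_components=0.95),  # keep enough components for 95% of the variance
    RandomForestClassifier(n_estimators=50, random_state=0),
)

# Scaling and PCA are refitted inside each fold, which avoids leakage.
scores = cross_val_score(pipeline, X, y, cv=5)
pipeline.fit(X, y)
n_kept = pipeline.named_steps["pca"].n_components_
print(f"components kept: {n_kept} of {X.shape[1]}")
print(f"mean CV accuracy: {scores.mean():.3f}")
```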

From the graph and the table, it is clear that random forest is the best classifier of the four.

Pickling the model
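A minimal sketch of saving and reloading a fitted model with the standard `pickle` module. The file name `model.pkl`, the temporary directory and the synthetic data are assumptions chosen for illustration.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the project data (assumption).
X, y = make_classification(n_samples=100, n_features=8, random_state=0)
model = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"  # hypothetical file name
    with open(model_path, "wb") as fh:
        pickle.dump(model, fh)     # serialise the fitted model
    with open(model_path, "rb") as fh:
        restored = pickle.load(fh)  # load it back

# The restored model should reproduce the original predictions.
same_predictions = bool((restored.predict(X) == model.predict(X)).all())
print(same_predictions)
```

Pickle files should only be loaded from trusted sources, and ideally with the same scikit-learn version that produced them.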

Conclusion

All the predictions indicate the Pass label for every row in the future data. Some of these predictions may still be misclassified: the misclassification rate during training was 0.23%, and the precision was 99.53%.
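For reference, the two metrics quoted in the conclusion can be derived from confusion-matrix counts as sketched below. The function names and the counts in the example are illustrative assumptions; they are not the notebook's actual confusion matrix.

```python
def misclassification_rate(tp, fp, tn, fn):
    """Fraction of all predictions that are wrong: (FP + FN) / total."""
    return (fp + fn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)


# Illustrative counts only: 90 TP, 2 FP, 6 TN, 2 FN.
rate = misclassification_rate(tp=90, fp=2, tn=6, fn=2)
prec = precision(tp=90, fp=2)
print(f"misclassification rate: {rate:.2%}")
print(f"precision: {prec:.2%}")
```

The misclassification rate equals one minus the accuracy, so a low rate on the training data does not by itself guarantee the same rate on future data.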